feat(genai): add local/Data URI pre-upload support and configurable hooks#119

Merged
Cirilla-zmh merged 19 commits into alibaba:main from yyuuttaaoo:docs_and_calls
Feb 27, 2026

Conversation

Contributor

@yyuuttaaoo yyuuttaaoo commented Feb 10, 2026

Description

This PR refactors and enhances the GenAI multimodal upload pipeline. The primary change is a shift to a lazy-loading architecture, which avoids unnecessary resource overhead and adds the invocation points for multimodal processing that were previously missing. It also tightens security for local file handling and integrates AgentInvocation support for Agentscope.

Key changes

  • Enhance MultimodalPreUploader to support Data URI and local file path inputs.
  • Add AgentInvocation multimodal data support.
  • Add uploader and pre-uploader entry points in pyproject.toml.
  • Support dynamic loading of uploader/pre-uploader hooks via environment variables.
  • Improve multimodal pre-upload metadata extraction and upload mode handling.
  • Implement GenAIShutdownProcessor for graceful shutdown of GenAI components.
  • Enhance README-loongsuite.rst with multimodal upload configuration and usage guidance.
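
The hook loading described in the list above could look roughly like the following. This is a minimal sketch, not the code from this PR: the environment variable name `EXAMPLE_GENAI_UPLOADER_HOOK`, the entry point group `example_genai_uploader`, and the helper `load_hook` are all hypothetical names chosen for illustration. It assumes Python 3.10+, where `importlib.metadata.entry_points` accepts a `group` keyword.

```python
import os
from importlib.metadata import entry_points
from typing import Any, Callable, Iterable, Optional

# Hypothetical names, chosen for illustration only (not taken from the PR).
HOOK_ENV_VAR = "EXAMPLE_GENAI_UPLOADER_HOOK"
HOOK_GROUP = "example_genai_uploader"


def _discover(group: str) -> Iterable[Any]:
    # Python 3.10+: select the entry points registered under one group.
    return entry_points(group=group)


def load_hook(
    discover: Callable[[str], Iterable[Any]] = _discover,
) -> Optional[Any]:
    """Build the hook named by the environment variable, or return None."""
    name = os.environ.get(HOOK_ENV_VAR, "").strip()
    if not name:
        return None  # feature disabled: nothing is imported or constructed
    for entry_point in discover(HOOK_GROUP):
        if entry_point.name == name:
            factory = entry_point.load()  # the import happens only here
            return factory()
    return None  # unknown hook name
```

Because nothing is imported until `load_hook` is called with a configured name, an application that never enables the feature pays no import or construction cost, which is the kind of lazy behavior the description refers to.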

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test A

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

…ring

Support Uri objects with data: scheme (base64 encoded only) in MultimodalPreUploader.

- Refactor _process_message_parts to reduce redundancy:
  - Add _normalize_audio_data to unify audio format detection and conversion.
  - Add _check_size and _estimate_base64_size to unify size limit checks.
  - Add _resolve_mime_type to unify MIME type resolution logic.

Optimize Data URI processing by estimating size before decoding.
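
As an illustration of estimating size before decoding, here is a minimal sketch. It is not the PR's `_estimate_base64_size`; the functions `estimate_base64_size` and `decode_data_uri` and the `max_bytes` parameter are assumptions made for this example. Every 4 base64 characters encode 3 bytes, minus one byte per trailing `=`, so an oversized payload can be rejected from its length alone.

```python
import base64
from typing import Optional, Tuple


def estimate_base64_size(payload: str) -> int:
    """Estimate the decoded byte length of a base64 string without decoding."""
    stripped = payload.strip()
    padding = len(stripped) - len(stripped.rstrip("="))
    return (len(stripped) * 3) // 4 - padding


def decode_data_uri(uri: str, max_bytes: int) -> Optional[Tuple[str, bytes]]:
    """Return (mime_type, data) for a base64 data: URI, or None if rejected."""
    if not uri.startswith("data:"):
        return None
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep or not header.endswith(";base64"):
        return None  # this sketch handles base64-encoded Data URIs only
    # Fallback MIME type is an assumption made for this sketch.
    mime_type = header[: -len(";base64")].split(";")[0] or "text/plain"
    if estimate_base64_size(payload) > max_bytes:
        return None  # rejected before paying the decoding cost
    try:
        data = base64.b64decode(payload, validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        return None
    return mime_type, data
```

The size check runs on the string length only, so a very large inline payload never has to be decoded into memory just to be discarded.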

- Merge and expand unit tests in util/opentelemetry-util-genai/tests/_multimodal_upload/test_pre_uploader.py, covering:
  - Data URI processing (base64, explicit mime, invalid formats).
  - Local file URI processing.
  - Size limit enforcement for Blob, Base64Blob, and Data URI.
…singMixin

- Added type checks for invocation objects in async methods to ensure correct processing.
- Updated logging to use a casted logger for better type safety.
- Refactored metric recording calls to remove type ignores and enhance clarity.
- Adjusted the _record_extended_metrics method signature to accept specific invocation types.
- Cleaned up test setup by removing unnecessary reset logic for MetricsSingletonMeta.
…tion support

- Added entry points for multimodal uploader and pre-uploader in `pyproject.toml`.
- Enhanced `README-loongsuite.rst` with detailed instructions on enabling multimodal upload features.
- Implemented `GenAIShutdownProcessor` for graceful shutdown of GenAI components.
- Refactored `MultimodalProcessingMixin` to support dynamic loading of uploader and pre-uploader hooks based on environment variables.
- Updated `MultimodalPreUploader` to handle new upload modes and improved metadata extraction.
- Added tests for shutdown processor and default hooks to ensure functionality and error handling.
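
For readers unfamiliar with the pattern, a graceful shutdown under a time budget might be sketched as follows. This is a generic illustration, not the `GenAIShutdownProcessor` implementation, whose internals are not shown in this conversation; the function `shutdown_all` and its parameters are hypothetical.

```python
import threading
import time
from typing import Callable, Sequence


def shutdown_all(callbacks: Sequence[Callable[[], None]], timeout_s: float) -> bool:
    """Run shutdown callbacks in order under one shared deadline.

    Returns True if every callback finished before the deadline.
    """
    deadline = time.monotonic() + timeout_s
    for callback in callbacks:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # budget exhausted before this component started
        # Daemon thread: a stuck component cannot block interpreter exit.
        worker = threading.Thread(target=callback, daemon=True)
        worker.start()
        worker.join(remaining)
        if worker.is_alive():
            return False  # this component overran the remaining budget
    return True
```

Sharing one deadline across all components, rather than giving each its own timeout, keeps the total shutdown time bounded no matter how many components are registered.
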
@yyuuttaaoo yyuuttaaoo changed the title feat: Enhance MultimodalPreUploader with Data URI and Local path support feat: Enhance MultimodalPreUploader with local path support and introduce pre-upload hooks with configuration support Feb 12, 2026
@yyuuttaaoo yyuuttaaoo changed the title feat: Enhance MultimodalPreUploader with local path support and introduce pre-upload hooks with configuration support feat: Enhance MultimodalPreUploader with local path support and introduce configurable pre-upload hooks and uploader entry points Feb 12, 2026
@yyuuttaaoo yyuuttaaoo changed the title feat: Enhance MultimodalPreUploader with local path support and introduce configurable pre-upload hooks and uploader entry points feat(genai): add local/Data URI pre-upload support and configurable hooks Feb 12, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR enhances the GenAI multimodal upload pipeline with support for Data URI and local file path inputs, adds AgentInvocation multimodal data handling, implements a configurable hook-based system for uploader/pre-uploader loading, and introduces graceful shutdown coordination for GenAI components.

Changes:

  • Add Data URI and local file path support to MultimodalPreUploader with security controls
  • Implement entry-point-based hook system for dynamic uploader/pre-uploader discovery and loading
  • Add multimodal data processing support for InvokeAgentInvocation with async handling
  • Introduce GenAIShutdownProcessor for coordinated graceful shutdown of GenAI components
  • Refactor metrics recording with _record_extended_metrics helper method
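
One common form of security control for local file access is to resolve the requested path and confirm it stays under an allowed root. The sketch below shows that idea only; it is not the path validation in `pre_uploader.py`, and `resolve_allowed_path` and `allowed_root` are hypothetical names.

```python
from pathlib import Path
from typing import Optional


def resolve_allowed_path(candidate: str, allowed_root: str) -> Optional[Path]:
    """Return the resolved file path if it lies under allowed_root, else None."""
    root = Path(allowed_root).resolve()
    path = Path(candidate).resolve()  # collapses ".." and follows symlinks
    try:
        path.relative_to(root)  # raises ValueError when path is outside root
    except ValueError:
        return None
    if not path.is_file():
        return None  # missing files and directories are rejected
    return path
```

Resolving both paths before comparing them is what defeats `..` segments and symlinks that would otherwise point outside the allowed directory.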

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| `test_shutdown_processor.py` | New test file validating shutdown processor behavior with mock handlers and uploaders |
| `test_extended_handler.py` | Agent multimodal async processing tests, Chinese comments for shutdown tests |
| `test_pre_uploader_audio.py` | Add default upload mode fixture for audio tests |
| `test_pre_uploader.py` | Comprehensive tests for Data URI, local file, and size limit handling |
| `test_multimodal_upload_hook.py` | New tests for entry-point-based hook loading and validation |
| `test_fs_uploader.py` | Corrected OSS environment variable names |
| `test_default_hooks.py` | New tests for default fs uploader and pre-uploader hook factories |
| `shutdown_processor.py` | New SpanProcessor for coordinated graceful shutdown with configurable timeouts |
| `extended_types.py` | Added monotonic_end_s field to InvokeAgentInvocation |
| `extended_handler.py` | Agent multimodal support, refactored metrics recording, renamed shutdown method |
| `extended_environment_variables.py` | New env vars for upload mode, hook selection, local file access, changed defaults |
| `pre_uploader.py` | Data URI parsing, local file reading with path validation, audio normalization refactor |
| `multimodal_upload_hook.py` | New hook loading system with entry point discovery and singleton caching |
| `fs_uploader.py` | Added fs_uploader_hook factory function for entry point integration |
| `_base.py` | Added default shutdown method to PreUploader protocol |
| `__init__.py` | Updated exports for new hook-based loading system |
| `_multimodal_processing.py` | Agent multimodal async methods, refactored fallback/dispatch logic |
| `pyproject.toml` | Entry point configuration for uploader/pre-uploader hooks |
| `README-loongsuite.rst` | Documentation for multimodal upload, hook system, shutdown processor usage |
| `CHANGELOG-loongsuite.md` | Changelog entry for this PR |

@Cirilla-zmh Cirilla-zmh added the enhancement, infra, and genai labels Feb 13, 2026
Collaborator

@Cirilla-zmh Cirilla-zmh left a comment

Great job! I have a few concerns about the implementation that may be worth addressing.

- Split multimodal upload dependencies into separate groups in `pyproject.toml`.
- Introduced new environment variables for audio conversion and local file handling.
- Updated `README-loongsuite.rst` with detailed descriptions of new features and configuration options.
- Refactored `MultimodalProcessingMixin` and `MultimodalPreUploader` to improve shutdown handling and configuration management.
- Added utility functions for environment variable parsing related to multimodal features.
- Removed the `GenAIShutdownProcessor` class as it is no longer needed.
- Enhanced tests for multimodal upload hooks and audio conversion functionality.
Collaborator

@Cirilla-zmh Cirilla-zmh left a comment

LGTM

@Cirilla-zmh Cirilla-zmh merged commit a26e3b5 into alibaba:main Feb 27, 2026
532 of 696 checks passed

Labels

  • enhancement: New feature or request
  • genai: The genai label represents issues related to generative AI.
  • infra: The infra label represents issues related to Infrastructure

4 participants